# Unsupervised Preference Optimization
## Mistral-ORPO-β

Mistral-ORPO-β is a 7B-parameter language model fine-tuned from Mistral-7B with the ORPO (Odds Ratio Preference Optimization) method, which learns preferences directly from pairwise feedback without a supervised fine-tuning warm-up phase.

- License: MIT
- Author: kaist-ai
- Tags: Large Language Model, Transformers, English
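
ORPO folds preference learning into the standard language-modeling objective: alongside the usual cross-entropy loss, it adds an odds-ratio penalty that raises the odds of a preferred completion over a rejected one, which is why no separate SFT warm-up stage or reference model is needed. Below is a minimal sketch of that loss term in PyTorch, assuming per-token-averaged sequence log-probabilities for the chosen and rejected completions have already been computed; the function name and the default λ are illustrative, not the model's exact training code.

```python
import torch
import torch.nn.functional as F

def orpo_loss(
    chosen_logps: torch.Tensor,    # average log P(y_w | x) per sequence, in (-inf, 0)
    rejected_logps: torch.Tensor,  # average log P(y_l | x) per sequence, in (-inf, 0)
    nll_loss: torch.Tensor,        # standard SFT (cross-entropy) loss on the chosen responses
    lam: float = 0.1,              # weight of the odds-ratio term (illustrative default)
) -> torch.Tensor:
    # odds(y | x) = P(y | x) / (1 - P(y | x)), so
    # log-odds ratio between chosen and rejected completions:
    log_odds = (chosen_logps - rejected_logps) - (
        torch.log1p(-torch.exp(chosen_logps)) - torch.log1p(-torch.exp(rejected_logps))
    )
    # penalize low relative odds of the chosen response
    ratio_loss = -F.logsigmoid(log_odds).mean()
    # one objective: plain SFT loss plus the preference penalty,
    # trained in a single phase with no warm-up
    return nll_loss + lam * ratio_loss
```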
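
For reference, a minimal inference sketch with the Transformers library; the Hub ID `kaist-ai/mistral-orpo-beta` is an assumption inferred from the author and model name above, and the generation settings are arbitrary.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kaist-ai/mistral-orpo-beta"  # assumed Hugging Face Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a chat-formatted prompt (assumes the repo ships a chat template)
messages = [{"role": "user", "content": "Summarize ORPO in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```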